Finding Local Models in Classification (Extended Abstract)
Abstract
It is common knowledge that more and more data is collected everywhere and that the size of the data sets available for knowledge discovery is increasing exponentially. On the one hand this is good, because learning with high-dimensional data and complex dependencies may need a large number of examples to give accurate results. On the other hand, there are several learning problems which cannot be thoroughly solved by applying a standard learning algorithm to all available examples. While the accuracy of the learner typically increases with the number of examples, there may be many other criteria for a learning algorithm, such as speed, robustness or comprehensibility, which are negatively affected by too many examples. What can we do about this problem? Often one can efficiently find a simple model which provides not an optimal solution, but a reasonably good approximation. The hard work usually lies in improving an already good model. Hence, we can try to find a simple model first and then concentrate on improving only those parts of the input space where the model is not good enough. This will be an easier task because fewer examples have to be considered, and hence one might use a more sophisticated learner. In other words, one constructs not one single global model for all the data, but a global model plus one or more local models to cover special cases. There is an obvious connection between local models and the detection of local patterns, which is defined as the unsupervised detection of high-density regions in the data [Hand, 2002]: given a global model, the goal is to identify structures, i.e. to find high-density regions, in the errors of this model. The difference is that here the goal is not just to identify these regions, but to use them to improve the overall performance, e.g. the classification accuracy.
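The global-plus-local idea described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy grid data, the decision stump used as the simple global model, the padded bounding box used as a crude stand-in for a high-density error region, the 1-nearest-neighbour local learner, and all function names.

```python
from collections import Counter


def make_data(n=20):
    """Hypothetical toy data on a grid: class 1 iff x0 > 0.5, plus a small square of exceptions."""
    data = []
    for i in range(n):
        for j in range(n):
            x = ((i + 0.5) / n, (j + 0.5) / n)
            in_exception = 0.1 <= x[0] <= 0.3 and 0.1 <= x[1] <= 0.3
            data.append((x, 1 if x[0] > 0.5 or in_exception else 0))
    return data


def fit_stump(data, steps=20):
    """Simple global model: the single-feature threshold with the fewest training errors."""
    best = None
    for feature in (0, 1):
        for s in range(steps + 1):
            t = s / steps
            errors = sum((x[feature] > t) != bool(y) for x, y in data)
            if best is None or errors < best[0]:
                best = (errors, feature, t)
    _, feature, t = best
    return lambda x: int(x[feature] > t)


def error_region(data, model, margin=0.1):
    """Padded bounding box around the misclassified examples (an assumed, crude region finder)."""
    wrong = [x for x, y in data if model(x) != y]
    if not wrong:
        return lambda x: False
    lo = tuple(min(x[d] for x in wrong) - margin for d in (0, 1))
    hi = tuple(max(x[d] for x in wrong) + margin for d in (0, 1))
    return lambda x: all(lo[d] <= x[d] <= hi[d] for d in (0, 1))


def fit_knn(data, k=1):
    """Local model: k-nearest-neighbour vote over the examples it was given."""
    def predict(x):
        nearest = sorted(
            data,
            key=lambda ex: (ex[0][0] - x[0]) ** 2 + (ex[0][1] - x[1]) ** 2,
        )[:k]
        return Counter(y for _, y in nearest).most_common(1)[0][0]
    return predict


def fit_global_plus_local(data):
    """Fit the global model, then a local model only on the region where it errs."""
    global_model = fit_stump(data)
    in_region = error_region(data, global_model)
    local_model = fit_knn([ex for ex in data if in_region(ex[0])])

    def combined(x):
        return local_model(x) if in_region(x) else global_model(x)

    return global_model, combined


def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)


data = make_data()
global_model, combined_model = fit_global_plus_local(data)
print(accuracy(global_model, data), accuracy(combined_model, data))
```

On this toy grid the stump misclassifies only the exception square, and the local learner is trained on a small fraction of the examples, which is the efficiency argument made above. The accuracies printed are training accuracies on synthetic data and say nothing about the results reported in the paper.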
Besides the obvious improvements in efficiency, the local model approach has an interesting implication for interpretability and interestingness: as humans are very limited in the level of complexity they can intuitively understand [Miller, 1956], a simple model may give the user a useful intuition without going into too much detail. On the other hand, for the discovery of new knowledge, it may happen that the global model finds only the obvious pat-
Publication date: 2004